JEFFERY, Administrative Patent Judge.

DECISION ON APPEAL^ 



^ Appellants list Application No. 10/528,636 on page 1 of the Appeal Brief
but also have Application No. 10/877,581 in the Brief's header (see Br. 2-17)
and discuss the real party for Application No. 10/428,973 (see Br. 3).
Based on the issues presented, we will assume that this appeal relates to
Application No. 10/528,636, which has corresponding claims.
^ The two-month time period for filing an appeal or commencing a civil 
action, as recited in 37 C.F.R. § 1.304, or for filing a request for rehearing, 
as recited in 37 C.F.R. § 41.52, begins to run from the "MAIL DATE" 
(paper delivery mode) or the "NOTIFICATION DATE" (electronic delivery 
mode) shown on the PTOL-90A cover letter attached to this decision. 



Appeal 2009-009929 
Application 10/528,636 



Appellants appeal under 35 U.S.C. § 134(a) from the Examiner's 
rejection of claims 1-3 and 6. The Examiner indicates that claims 4 and 5 
contain allowable subject matter. Ans. 7. We have jurisdiction under 
35 U.S.C. § 6(b). We reverse. 

STATEMENT OF THE CASE 

Appellants' invention involves clustering key images using spatial and 
temporal attributes. See generally Spec. 1. Claim 1 is reproduced below 
with a key limitation emphasized: 

1. Method of clustering images of a video sequence consisting of 
shots and represented by a graph-like structure, a node of the graph 
representing a shot or a class of shots defined by key images and the 
nodes being connected by edges, comprising the following iteration: 

selecting an edge ak connecting nodes ni and nj[,]

calculating a potential of node n^, 

merging of the two nodes ni and nj, as a function of the distances
between the attributes of the key images defining the class of shots of
node ni and those of the key images defining the class of shots of node
nj and as a function of the temporal distance of these key images,

calculating a potential of each edge connecting the merged node to 
another node of the graph previously connected to nodes ni or nj, as a 
function of the distances between the attributes of the key images 
defining the class of shots of the merged node and those of the key 
images defining the class of shots of the other node and as a function 
of the temporal distance between these key images, the new class of 
shots associated with the merged node comprising the key images of 
the classes of shots of the merged nodes, and 

merging of the two nodes and validation of the new graph if an 
energy of this graph, which is the sum of the potentials of the nodes 
and of the edges, is less than the energy of the graph before merging. 
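
The iteration recited in claim 1 can be sketched in code under stated assumptions. The forms of the node and edge potentials, the time weight, the threshold, and every name below (KeyImage, pair_cost, node_potential, edge_potential, energy, try_merge) are hypothetical; neither the claim nor the record specifies them. The sketch only shows how a merge can depend on both attribute distance and temporal distance of key images, and how a merge is validated by comparing graph energies.

```python
from dataclasses import dataclass
from itertools import combinations, product


@dataclass(frozen=True)
class KeyImage:
    attrs: tuple  # hypothetical attribute vector, e.g. a color histogram
    time: float   # hypothetical timestamp of the key image, in seconds


W_TIME = 0.1     # assumed weight given to temporal distance
THRESHOLD = 1.0  # assumed cost below which two classes look alike


def pair_cost(a, b):
    """Attribute distance plus weighted temporal distance of two key images."""
    attr_dist = sum((x - y) ** 2 for x, y in zip(a.attrs, b.attrs)) ** 0.5
    return attr_dist + W_TIME * abs(a.time - b.time)


def node_potential(images):
    """Assumed form: mean pairwise cost within one class of shots."""
    pairs = list(combinations(images, 2))
    return sum(pair_cost(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def edge_potential(images_a, images_b):
    """Assumed form: positive only when two connected classes are close."""
    pairs = list(product(images_a, images_b))
    mean_cost = sum(pair_cost(a, b) for a, b in pairs) / len(pairs)
    return max(0.0, THRESHOLD - mean_cost)


def energy(nodes, edges):
    """Sum of the potentials of the nodes and of the edges."""
    total = sum(node_potential(imgs) for imgs in nodes.values())
    for edge in edges:
        a, b = tuple(edge)
        total += edge_potential(nodes[a], nodes[b])
    return total


def try_merge(nodes, edges, i, j):
    """Merge nodes i and j; keep the new graph only if its energy is lower."""
    merged = f"{i}+{j}"
    new_nodes = {k: v for k, v in nodes.items() if k not in (i, j)}
    new_nodes[merged] = nodes[i] + nodes[j]  # merged class holds both sets of key images
    new_edges = set()
    for edge in edges:
        rewired = frozenset(merged if n in (i, j) else n for n in edge)
        if len(rewired) == 2:  # drops the selected edge between i and j
            new_edges.add(rewired)
    if energy(new_nodes, new_edges) < energy(nodes, edges):
        return new_nodes, new_edges, True
    return nodes, edges, False
```

Here nodes maps a node label to a tuple of KeyImage objects and edges is a set of two-element frozensets of labels. Under these assumed potentials, two classes whose key images are close in both attributes and time are merged, while classes with identical attributes but widely separated key images are not.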
The Examiner relies on the following as evidence of unpatentability:

Yeo       US 5,821,945       Oct. 13, 1998
Oguz      US 7,054,367 B2    May 30, 2006 (filed Dec. 31, 2001)
Geiger    US 7,212,201 B1    May 1, 2007 (filed Sept. 20, 2000)

The Rejection 

The Examiner rejected claims 1-3 and 6^ under 35 U.S.C. § 103(a) as 
unpatentable over Yeo, Oguz, and Geiger. Ans. 3-7.^



The Contentions 
Regarding independent claim 1, the Examiner finds that Yeo teaches 
merging nodes ni and nj as a function of the distances between the key 
images' attributes defining the class of shots for nodes ni and nj but fails to
discuss additionally merging these nodes as a function of the key images'
temporal distance. Ans. 4. The Examiner relies on Oguz to teach this 
missing limitation. Ans. 4-5. 



^ The Examiner states "claims 1-2 and 3-6" are rejected under § 103. 
Ans. 3. However, the Examiner only discusses claims 1-3 and 6 in the body 
of the rejection (Ans. 3-7) and also indicates that claims 4 and 5 are 
allowable if rewritten in independent form (Ans. 7). We therefore 
presume — as do Appellants (Br. 5) — that the Examiner intended to reject 
only claims 1-3 and 6 and only address arguments thereto. 

^ Throughout this opinion, we refer to (1) the Appeal Brief filed November
17, 2008, and (2) the Examiner's Answer mailed January 26, 2009. 
Appellants argue that Yeo does not suggest a temporal distance or
merging nodes according to potentials which are functions of temporal
distances. See Br. 11. Appellants also contend that Oguz does not discuss a
temporal distance relating to key images (Br. 9), and that the claim includes
merging nodes as a function of this temporal distance (see Br. 10).

The issue before us, then, is as follows: 

ISSUE 

(1) Under § 103, has the Examiner erred in rejecting claim 1 by 
finding that Yeo, Oguz, and Geiger collectively would have taught or 
suggested merging nodes ni and nj as a function of the temporal distance of
key images that define ni and nj's class shots? 

FINDINGS OF FACT (FF) 

1. Yeo discloses video shots that exhibit visual, spatial, and
temporal similarities are clustered into scenes, where each scene contains
one or more shots of similar content. Using the clustering results and
temporal information associated with each shot, the system builds a graph
with nodes representing scenes and edges representing the story's progress
from one scene to the next. Yeo, col. 5, ll. 32-40.

2. In the context of clustering video shots, Yeo measures shot 
similarities or a proximity index based on image attributes, such as color, 
spatial correlation, and shape. Yeo describes first grouping shots that are 
most similar together, and then grouping other shots by their proximity
values. Yeo, col. 6, ll. 26-31; col. 8, ll. 44-62; col. 9, ll. 4-13.

3. Yeo also discusses temporal variations in video shots when 
camera motions are prominent (e.g., first shot zooms in to Mr. A and a 
second shot zooms out to Ms. B). In this case, a representative image is not 
sufficient to analyze the image set it represents. Yeo explains the system
chooses a greatly reduced representative set of frames to represent a video
shot so as to reduce computing loads. Yeo also states that clustering is not
confined to only one such representative image. Yeo, col. 8, ll. 9-41.

4. Yeo provides two examples of a hierarchical organization of 
shots. Example 1 is a tree representation of shots in time and has a temporal 
relation defined by an edge. Example 2 is a directed graph representation of 
shots that has cluster relations governed by a temporal ordering of shots 
within two clusters. Yeo also explains that the edge relationships induced by
temporal precedence at level 0 are preserved as one moves up the hierarchy.
Yeo, col. 4, ll. 1-58.

5. Oguz teaches detecting a scene change in a video sequence by 
computing a coincidence coefficient that indicates the degree of coincidence 
between significant video frame edges in a current frame and a prior frame 
to within a distance of a small number of blocks. The number of blocks can 
be determined based on the temporal distance between the current and prior 
frames and the motion in the scene in that temporal vicinity. See Oguz, col.
6, ll. 7-35; col. 8, ll. 8-32; Fig. 2.
ANALYSIS

Based on the record before us, we find error in the Examiner's 
obviousness rejection of claim 1 which calls for, in pertinent part, merging 
nodes as a function of the temporal distance of key images that define nodes 
ni and nj's class shots. As the Examiner acknowledges (Ans. 13-14), Yeo
discusses using temporal information when representing shots graphically. 
Yeo also clusters or merges nodes (e.g., a node representing a shot) based on 
temporal similarities. See FF 1. Yeo, however, provides no more details 
regarding these "temporal similarities," but only describes measuring shot 
similarities based on image attributes, such as color, spatial correlation, and 
shape. See FF 2. Also, while Yeo discusses temporal variations (see FF 3),
this temporal variation relates to selecting a representative frame set for a 
shot to reduce computations — not a temporal distance of these representative 
frames for clustering purposes. See id. 

The Examiner also cites to Yeo's discussion (see Ans. 13-14) of 
hierarchical shot organizations and discusses edge relationships induced by a 
temporal precedence. See FF 4. However, this temporal precedence does 
not address a temporal distance of key images, let alone how this precedence 
is used to cluster or merge nodes as claim 1 requires. Yeo discloses that a
hierarchical tree representation has a temporal relation defined by an edge.
See id. This temporal relation relates to edges and not a node's key images,
and thus does not teach merging nodes based on this temporal relationship. See id.
Yeo also discusses a directed graph representation of shots that include 
cluster (e.g., nodes) relations governed by a temporal ordering of shots
within two clusters. See id. While Yeo's temporal shot ordering within a
node (e.g., a cluster) governs node relationships, Yeo does not teach or
suggest that this temporal ordering is a temporal distance of key images or that
the nodes (e.g., clusters) are merged as a function of the temporal ordering. We
therefore disagree with the Examiner's assertion that Yeo's temporal 
ordering of shots "seems to [be] equivalent to 'the ordering of shots' 
temporal values/distance." Ans. 14. 

On the other hand, Oguz detects a scene change (e.g., detecting edges 
between scenes or nodes) by computing a coincidence coefficient. See FF 1, 
5. Specifically, the coincidence coefficient indicates the coincidences 
between significant edges in a current frame and in a prior frame to within a 
distance of a small number of blocks, and the blocks are determined using a 
temporal distance between the current and prior frames. See id. Thus, 
because Oguz's coincidence coefficient considers the blocks' distance and 
the block distance is determined using a temporal distance between frames, 
the temporal distance is also a component used to compute a coincidence 
coefficient. See id. Additionally, Appellants admit as much. See Br. 10 
(second emphasis added) ("the disclosed temporal distance in Oguz ... is 
used to scale motion . . . and to compute a coincidence coefficient"). 

Nonetheless, Oguz only teaches using this temporal distance to detect 
when a scene changes. See FF 5. Thus, at best, Oguz suggests calculating a 
potential of an edge (e.g., where a scene representing a shot class changes) 
as a function of a temporal distance. We fail to see how this teaching further
teaches merging nodes (e.g., shots or scenes) as a function of this temporal 
distance as recited in claim 1. Nor has the Examiner provided adequate 
evidence that an ordinarily skilled artisan would have recognized that 
Oguz's ability to detect scene changes based, in part, on a temporal distance 
also suggests clustering or merging nodes as a function of this key image 
distance. See Ans. 4-5. The Examiner states Oguz's system allows for
"different categorization/clustering method[s]." Ans. 5. Even so, we do not
find that Oguz teaches or suggests that clustering or merging nodes based on the
recited temporal distance is known in the art, and thus Oguz does not teach or
suggest merging nodes as a function of a temporal distance of key images that
define nodes ni and nj's class shots as required by claim 1. See Ans. 14.

For the foregoing reasons, Appellants have persuaded us of error in
the obviousness rejection of: (1) independent claim 1 and (2) dependent 
claims 2, 3, and 6 for similar reasons. 

CONCLUSION 
The Examiner erred in rejecting claims 1-3 and 6 under § 103. 
ORDER

The Examiner's decision rejecting claims 1-3 and 6 is reversed. 



REVERSED 



rwk 

Robert D. Shedd, Patent Operations 
THOMSON Licensing LLC 
P.O. Box 5312 
Princeton, NJ 08543-5312 